Abstract

Several particle algorithms admit a Feynman-Kac representation such 
that the potential function may be expressed as a recursive function which 
depends on the complete state trajectory. An important example is the 
mixture Kalman filter, but other models and algorithms of practical inter- 
est fall in this category. We study the asymptotic stability of such particle 
algorithms as time goes to infinity. As a corollary, practical conditions for 
the stability of the mixture Kalman filter, and a mixture GARCH filter, 
are derived. Finally, we show that our results can also lead to weaker
conditions for the stability of standard particle algorithms, where the
potential function depends on the last state only.



1 Introduction 

The most common application of the theory of Feynman-Kac formulae (see e.g.
Del Moral, 2004) is nonlinear filtering of a hidden Markov chain $(\Lambda_n)$,
based on an observed process $(Y_n)$. In such settings, the potential function
at time $n$ typically depends only on the current state $\Lambda_n$. The uniform
stability of the corresponding particle approximations can be obtained under
appropriate conditions, see Section 7.4.3 of the aforementioned book and
references therein. For a good overview of the theoretical and methodological
aspects of particle approximation algorithms, also known as particle filtering
algorithms, see also Doucet et al. (2001), Künsch (2001), and Cappé et al. (2005).

There are, however, several applications of practical interest where the
potential function depends on the complete state trajectory
$\Lambda_{0:n} = (\Lambda_0, \ldots, \Lambda_n)$. The corresponding particle
filtering algorithms still have a fixed computational cost per iteration,
because the potential can be computed using recursive formulae. An important
example is the class of conditional linear Gaussian dynamic models, where the
conditioning is on some unobserved Markov chain $\Lambda_n$. The corresponding
particle algorithm is known as the mixture Kalman filter (Chen and Liu, 2000,
see also Example 7 in Doucet et al., 2000, and Andrieu and Doucet, 2002, for a
related algorithm): the potential function at time $n$ is then a Gaussian
density, the parameters of which are computed recursively using the
Kalman-Bucy filter (Kalman and Bucy, 1961). Another example is the mixture
GARCH model considered in Chopin (2007).

It is worth noting that models with path-dependent potential functions
can often be reformulated as a standard hidden Markov model,
with a potential function depending on the last state only, by adding compo- 
nents to the hidden Markov chain. For instance, the mixture Kalman filter may 
be interpreted as a standard particle filtering algorithm, provided the hidden 
Markov process is augmented with the associated Kalman filter parameters (fil- 
tering expectation and error covariance matrix) that are computed iteratively 
in the algorithm. However, this representation is unwieldy, and the augmented 
Markov process does not fulfil the usual mixing conditions found in the litera- 
ture on the stability of particle approximations. This is the main reason why our 
study is based on path-dependent potential functions. Quite interestingly, we 
shall see that the opposite perspective is more fruitful. Specifically, our stabil- 
ity results obtained for path-dependent potential functions can also be applied 
to standard state-space models, leading to stability results under conditions 
different from those previously given in the literature. 

In this paper, we study the asymptotic stability of particle algorithms based 
on path-dependent potential functions. We work under the assumption that 
the dependence of potential n on state n — p vanishes exponentially in p. This 
assumption is met in practical settings because of the recursive nature of the 
potential functions. Our proofs are based on the following construction: the true 
filter is compared with an approximate filter associated to 'truncated' potentials,
that is, potentials that depend only on $\lambda_{n-p+1:n}$, the vector of the
last $p$ states, for some well-chosen integer $p$. Then, we compare the truncated
filter with its particle approximation, using the fact that the 'truncated' filter
corresponds to a standard Feynman-Kac model with a Markov chain of fixed
dimension. Finally,
we use a coupling construction to compare the particle approximations of the 
true filter and the truncated filter. In this way, we obtain estimates of the 
stability of the particle algorithm of interest. We apply our results to the two 
aforementioned classes of models, and obtain practical conditions under which 
the corresponding particle algorithms are stable uniformly in time. 

The paper is organised as follows. Section 2 introduces the model and the
notations. Section 3 evaluates the local error induced by the truncation. Section
4 studies the mixing properties of the truncated filter. Section 5 studies the
propagation of the truncation error. Section 6 develops a coupling argument for
the two particle systems. Section 7 states the main theorem of the paper, which
provides a bound for the particle error and derives time-uniform estimates for
the long-term propagation of the error in the particle approximation of the true
model. Section 8 applies these results to two particle algorithms of practical
interest, namely the mixture Kalman filter and the mixture GARCH filter,
and shows how these results can be adapted to standard state-space models,
where the potential function depends only on the last state.



2 Model and notations 

We consider a hidden Markov model, with latent (non-observed) state process
$\{\Lambda_n, n \geq 0\}$, and observed process $\{Y_n, n \geq 1\}$, taking values
respectively in a complete separable metric space $E$ and in $F = \mathbb{R}^d$.
The state process is an inhomogeneous Markov chain, with initial probability
distribution $\zeta$ and transition kernel $Q_n$. The observed process $Y_n$ admits
$\Psi_n(y_n | y_{1:n-1}, \lambda_{0:n})$ as a conditional probability density (with
respect to an appropriate dominating measure) given $\Lambda_{0:n} = \lambda_{0:n}$
and $Y_{1:n-1} = y_{1:n-1}$, where the short-hand $v_{0:n}$ for any symbol $v$
stands for the vector $(v_0, \ldots, v_n)$. As explained in the Introduction, this
quantity depends on the entire path $\lambda_{0:n}$, rather than the last state
$\lambda_n$. Following common practice, we drop dependencies on the $y_n$'s in the
notations, as the observed sequence $y_{1:n}$ may be considered as fixed, and use
the short-hand $\Psi_n(\lambda_{0:n}) = \Psi_n(y_n | y_{1:n-1}, \lambda_{0:n})$.
The model admits a Feynman-Kac representation which we describe fully in (2.1).
We consider the following assumptions.

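Hypothesis 1. For all $n \geq 1$, the kernel $Q_n$ is mixing, i.e. there exists
$\varepsilon_n \in (0, 1)$ such that

$$\varepsilon_n \xi(A) \leq Q_n(\lambda_{n-1}, A) \leq \frac{1}{\varepsilon_n} \xi(A)$$

for some $\xi \in \mathcal{M}_+(E)$, and for any Borel set $A \subset E$, any
$\lambda_{n-1} \in E$.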

Hypothesis 2. For $p$ large enough, and all $n \geq p$, there exists a 'truncated'
potential function $\Psi_n^p(\lambda_{n-p+1:n})$ that depends on the last $p$ states
only, and that approximates $\Psi_n$ in the sense that

$$|\Psi_n(\lambda_{0:n}) - \Psi_n^p(\lambda_{n-p+1:n})| \leq \phi_n \tau^p \{\Psi_n(\lambda_{0:n}) \wedge \Psi_n^p(\lambda_{n-p+1:n})\}$$

for some constants $\phi_n$ and $\tau$, $\phi_n > 0$, $0 < \tau < 1$, and all
$\lambda_{0:n} \in E^{n+1}$. For convenience, we abuse notations and set
$\Psi_n^p = \Psi_n$ for $p > n$.

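Hypothesis 3. There exist constants $a_n$, $b_n$, $n \geq 0$, $a_n \geq 1$,
$b_n \geq 1$, such that

$$\frac{1}{a_n} \leq \Psi_n(\lambda_{0:n}) \leq b_n, \qquad \frac{1}{a_n} \leq \Psi_n^p(\lambda_{(n-p+1)^+:n}) \leq b_n$$

for all $\lambda_{0:n} \in E^{n+1}$, using the short-hand $k^+ = k \vee 0$ for any
integer $k$.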

The constants $a_n$ and $\phi_n$ depend implicitly on the realisation $y_{1:n}$ of
the observed process. Hypotheses 1 and 3 are standard in the filtering literature;
see e.g. Del Moral (2004). Hypothesis 2 formalises the fact that potential functions
are computed using iterative formulae, and therefore should forget past states
at an exponential rate. One may take
$\Psi_n^p(\lambda_{n-p+1:n}) = \Psi_n(z, \ldots, z, \lambda_{n-p+1:n})$
for instance, where $z$ is an arbitrary element of $E$. We shall work out, in
several models of interest, practical conditions under which Hypothesis 2 is
fulfilled in Section 8.

We introduce the following notations for the forward kernels, for $n \geq 1$:

$$\gamma_n(\lambda_{0:n-1}, d\lambda'_{0:n}) = \delta_{\lambda_{0:n-1}}(d\lambda'_{0:n-1})\, Q_n(\lambda_{n-1}, d\lambda'_n)\, \Psi_n(\lambda'_{0:n})$$

where $\delta_{\lambda_{0:n-1}}$ is the Dirac measure centred at $\lambda_{0:n-1}$.
The above kernels implicitly define operators on measures and on test functions,
i.e.,

$$\gamma_n\mu(f) = \langle \gamma_n\mu, f \rangle = \int \mu(d\lambda_{0:n-1})\, \gamma_n(\lambda_{0:n-1}, d\lambda'_{0:n})\, f(\lambda'_{0:n}),$$

for any $\mu \in \mathcal{M}_+(E^n)$, any test function $f : E^{n+1} \to [0,1]$,
where $\mathcal{M}_+(E^k)$ denotes the set of nonnegative measures w.r.t. $E^k$,
and $\mathcal{P}(E^k)$ the set of probability measures w.r.t. $E^k$.

We associate to $\gamma_n$ a "normalised" operator $R_n$, such that, for any
$\mu \in \mathcal{M}_+(E^n)$, $R_n\mu$ is defined as:

$$R_n\mu(f) = \frac{\gamma_n\mu(f)}{\gamma_n\mu(1)}$$

for any $f : E^{n+1} \to \mathbb{R}_+$. Both the $\gamma_n$'s and the $R_n$'s may
be iterated using the following short-hands, for $1 \leq k \leq n$:

$$\gamma_{k:n}\mu = \gamma_n \cdots \gamma_k \mu, \qquad R_{k:n}\mu = R_n \cdots R_k \mu.$$

We have the following Feynman-Kac representation:

$$E(f(\Lambda_{0:n}) | Y_{1:n} = y_{1:n}) = R_{1:n}\zeta(f), \qquad (2.1)$$

$\forall n$, $\forall f : E^{n+1} \to \mathbb{R}_+$, where, as mentioned above,
$\zeta$ is the law of $\Lambda_0$.
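
As an illustration of the representation (2.1), the following minimal sketch iterates the normalised operators on a two-point state space by explicit enumeration of paths. The state space, the initial law, the transition probabilities and the toy potential are all assumptions made for the example and are not taken from the models studied in this paper; the cost grows exponentially with the number of observations, which is precisely why particle approximations are used in practice.

```python
import math

STATES = (0, 1)                       # toy finite state space E (assumption)
ZETA = {0: 0.5, 1: 0.5}               # law of Lambda_0 (assumption)
Q = {0: {0: 0.9, 1: 0.1},             # transition kernel, taken homogeneous here
     1: {0: 0.2, 1: 0.8}}


def potential(path, y):
    """Toy path-dependent potential Psi_n(lambda_{0:n}): Gaussian density of y
    whose variance is built recursively along the whole path."""
    s2 = 1.0
    for lam in path[1:]:
        s2 = (0.5 + lam) + 0.5 * s2
    return math.exp(-0.5 * y * y / s2) / math.sqrt(2.0 * math.pi * s2)


def gamma_step(mu, y):
    """Unnormalised forward operator gamma_n: extend every path by one state,
    weighting by the transition probability and by the potential."""
    out = {}
    for path, mass in mu.items():
        for lam in STATES:
            new_path = path + (lam,)
            out[new_path] = mass * Q[path[-1]][lam] * potential(new_path, y)
    return out


def normalise(mu):
    """Map a nonnegative measure on paths to a probability measure."""
    total = sum(mu.values())
    return {path: mass / total for path, mass in mu.items()}


def path_filter(ys):
    """Compute R_{1:n} zeta by iterating R_k = normalise(gamma_k ...)."""
    mu = {(lam,): ZETA[lam] for lam in STATES}
    for y in ys:
        mu = normalise(gamma_step(mu, y))
    return mu


ys = [0.3, -1.2, 0.7]                 # hypothetical observations y_{1:3}
posterior = path_filter(ys)
# Conditional probability that the last state equals 1, i.e. f = 1{lambda_n = 1}.
prob_last_is_one = sum(m for path, m in posterior.items() if path[-1] == 1)
```

Normalising at every step or only once at the end gives the same measure, since each operator is linear up to a scalar factor.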

Finally, we denote the total variation norm on nonnegative measures by
$\|\cdot\|_{TV}$, the supremum norm on bounded functions by $\|\cdot\|_\infty$,
and the Hilbert metric by $h(\mu, \mu')$ for any pair
$\mu, \mu' \in \mathcal{M}_+(E^k)$, $k \geq 1$; see e.g. Atar and Zeitouni (1997)
or Le Gland and Oudjane (2004), Definition 3.3. We recall that the
Hilbert metric is scale invariant, and is related to the total variation norm in
the following way, see e.g. Lemma 3.4 in Le Gland and Oudjane (2004):

$$\|\mu - \mu'\|_{TV} \leq \frac{2}{\log 3}\, h(\mu, \mu') \qquad (2.2)$$

$$h(K\mu, K\mu') \leq \frac{1}{\varepsilon^2}\, \|\mu - \mu'\|_{TV} \qquad (2.3)$$

provided $K$ is an $\varepsilon$-mixing kernel. We can also derive the following
properties from the definition of $h$ ($\forall k \in \mathbb{N}^*$,
$\forall \mu, \mu' \in \mathcal{M}_+(E^k)$):

$$\forall \text{ kernel } Q, \quad h(Q\mu, Q\mu') \leq h(\mu, \mu'), \qquad (2.4)$$

$$\forall \text{ nonnegative function } \psi, \quad h(\psi\mu, \psi\mu') \leq h(\mu, \mu') \qquad (2.5)$$

with an equality in the latter equation if $\psi$ is positive.
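
For measures on a finite set the Hilbert metric reduces to the logarithm of the ratio between the largest and the smallest likelihood ratio, which makes the properties above easy to check numerically. The sketch below is only an illustration under that finite-state assumption; the total variation norm is taken as the L1 distance between weight vectors, and the random measures and the kernel `K` are arbitrary choices for the example.

```python
import numpy as np


def hilbert_metric(mu, nu):
    """Hilbert metric between two strictly positive measures on a finite set."""
    ratio = np.asarray(mu, dtype=float) / np.asarray(nu, dtype=float)
    return float(np.log(ratio.max() / ratio.min()))


def tv_norm(mu, nu):
    """Total variation norm, taken here as the L1 distance between weights."""
    return float(np.abs(np.asarray(mu) - np.asarray(nu)).sum())


rng = np.random.default_rng(0)
mu = rng.dirichlet(np.ones(5))            # two arbitrary probability vectors
nu = rng.dirichlet(np.ones(5))
K = rng.uniform(0.5, 1.5, size=(5, 5))    # strictly positive kernel (rows need not sum to one)

h = hilbert_metric(mu, nu)
scale_invariant = abs(hilbert_metric(3.0 * mu, nu) - h) < 1e-12   # scale invariance
tv_bounded = tv_norm(mu, nu) <= 2.0 / np.log(3.0) * h             # inequality (2.2)
contracted = hilbert_metric(mu @ K, nu @ K) <= h + 1e-12          # inequality (2.4)
```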



3 Local error induced by truncation 

Until further notice, $p$ is a fixed integer such that $p \geq 2$ and such that
Hypothesis 2 holds. Since our proofs involve a comparison between the true filter
and a 'truncated' filter, we introduce the projection operator $H_n^p$ which, for
$n \geq p$, associates to any measure $\mu(d\lambda_{0:n}) \in \mathcal{M}_+(E^{n+1})$
its marginal w.r.t. its last $p$ components, i.e.:

$$H_n^p\mu(f) = \int \mu(d\lambda_{0:n})\, f(\lambda_{n-p+1:n})$$

for any $f : E^p \to \mathbb{R}$; for $p > n$, let $H_n^p = \mathrm{Id}$. We also
define the following 'truncated' forward kernels, for $n \geq p$:

$$\gamma_n^p(\lambda_{n-p:n-1}, d\lambda'_{n-p+1:n}) = \delta_{\lambda_{n-p+1:n-1}}(d\lambda'_{n-p+1:n-1})\, Q_n(\lambda_{n-1}, d\lambda'_n)\, \Psi_n^p(\lambda'_{n-p+1:n})$$

and the associated normalised operators, for $\mu \in \mathcal{M}_+(E^p)$,
$f : E^p \to \mathbb{R}_+$:

$$R_n^p\mu(f) = \frac{\gamma_n^p\mu(f)}{\gamma_n^p\mu(1)}$$

and set $\gamma_n^p = \gamma_n$, $R_n^p = R_n$ for $n < p$. From now on, we will
refer to the filter associated to these 'truncated' operators as the truncated
filter.
We now evaluate the local error induced by the truncation. 



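Lemma 1. For all $1 \leq k \leq n$, and for all $\mu \in \mathcal{M}_+(E^k)$,

$$\left\| R^p_{k+1:n} H^p_k R_k \mu - R^p_{k:n} H^p_{k-1} \mu \right\|_{TV} \leq 2 \phi_k \tau^p.$$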



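Proof. Let $f : E^{p \wedge (n+1)} \to [0, 1]$. One has

$$R^p_{k+1:n} H^p_k R_k \mu(f) = \frac{\gamma^p_{k+1:n} H^p_k \gamma_k \mu(f)}{\gamma^p_{k+1:n} H^p_k \gamma_k \mu(1)}$$

where

$$\gamma^p_{k+1:n} H^p_k \gamma_k \mu(f) = \int \mu(d\lambda_{0:k-1})\, Q_k(\lambda_{k-1}, d\lambda_k)\, \Psi_k(\lambda_{0:k})\, f(\lambda_{(n-p+1)^+:n}) \prod_{i=k+1}^{n} \left[ Q_i(\lambda_{i-1}, d\lambda_i)\, \Psi_i^p(\lambda_{(i-p+1)^+:i}) \right]$$

and

$$\gamma^p_{k:n} H^p_{k-1} \mu(f) = \int \mu(d\lambda_{0:k-1})\, Q_k(\lambda_{k-1}, d\lambda_k)\, \Psi_k^p(\lambda_{(k-p+1)^+:k})\, f(\lambda_{(n-p+1)^+:n}) \prod_{i=k+1}^{n} \left[ Q_i(\lambda_{i-1}, d\lambda_i)\, \Psi_i^p(\lambda_{(i-p+1)^+:i}) \right]$$

hence

$$\left| \gamma^p_{k+1:n} H^p_k \gamma_k \mu(f) - \gamma^p_{k:n} H^p_{k-1} \mu(f) \right|$$

$$\leq \int \mu(d\lambda_{0:k-1})\, Q_k(\lambda_{k-1}, d\lambda_k) \left| \Psi_k(\lambda_{0:k}) - \Psi_k^p(\lambda_{(k-p+1)^+:k}) \right| f(\lambda_{(n-p+1)^+:n}) \prod_{i=k+1}^{n} \left[ Q_i(\lambda_{i-1}, d\lambda_i)\, \Psi_i^p(\lambda_{(i-p+1)^+:i}) \right]$$

$$\leq \phi_k \tau^p \int \mu(d\lambda_{0:k-1})\, Q_k(\lambda_{k-1}, d\lambda_k) \left( \Psi_k(\lambda_{0:k}) \wedge \Psi_k^p(\lambda_{(k-p+1)^+:k}) \right) f(\lambda_{(n-p+1)^+:n}) \prod_{i=k+1}^{n} \left[ Q_i(\lambda_{i-1}, d\lambda_i)\, \Psi_i^p(\lambda_{(i-p+1)^+:i}) \right]$$

$$\leq \phi_k \tau^p \left\{ \gamma^p_{k+1:n} H^p_k \gamma_k \mu(f) \wedge \gamma^p_{k:n} H^p_{k-1} \mu(f) \right\}$$

according to Hypothesis 2. And, since, for all $a, b, c, d \in \mathbb{R}_+$ such
that $a/b \leq 1$ and $c/d \leq 1$,

$$\left| \frac{a}{b} - \frac{c}{d} \right| \leq \frac{|a - c|}{b} + \frac{|d - b|}{b} \qquad (3.6)$$

one may conclude directly by taking $a = \gamma^p_{k+1:n} H^p_k \gamma_k \mu(f)$,
$b = \gamma^p_{k+1:n} H^p_k \gamma_k \mu(1)$, $c = \gamma^p_{k:n} H^p_{k-1} \mu(f)$
and $d = \gamma^p_{k:n} H^p_{k-1} \mu(1)$. □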

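Lemma 2. For $k \geq 1$, if there exists a (possibly random) probability kernel
$\hat{R}_k : E^{k \wedge p} \to \mathcal{P}(E^{(k+1) \wedge p})$ such that, for
all $\mu \in \mathcal{P}(E^{k \wedge p})$,

$$\sup_{f : \|f\|_\infty = 1} E\left( \left| \langle R^p_k \mu - \hat{R}_k \mu, f \rangle \right| \right) \leq \delta_k$$

for some $\delta_k > 0$, then, for all $i \geq 1$ and
$\mu \in \mathcal{P}(E^{k \wedge p})$,

$$\sup_{f : \|f\|_\infty = 1} E\left( \left| \langle R^p_{k:k+i} \mu - R^p_{k+1:k+i} \hat{R}_k \mu, f \rangle \right| \right) \leq 2 (a_{k+1} \cdots a_{k+i})(b_{k+1} \cdots b_{k+i})\, \delta_k$$

where the expectation is with respect to the distribution of $\hat{R}_k$.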

Proof. Using the same ideas as above, one has, for / : £(fc+i-p+i)A P ^ [ 0) l] ; 

K^k-.k+it 1 Jfe+l:fc+t J I ~p 5p m ~p 6 ,,ClV 

7fc+l:ft+i^fcM(l) 7fc+l:ft+i^fcM(lj 

In order to use inequality (3.6), compute



= E 



(Rln - R k n)(d\ k _ p+1) +.. k ) 



fc+i 



< 6^+1 . . . b k+i S k 
where / is defined as 



Yi Ql(^l-l>d^l)^(\l-p+l)+:l)f(^(k+i-p+-L)+:k+i) 

(Riv-Rkri(f) 



l=k+l 



f(\k-p+l) + :k) = / Y\. Ql(^l-^>^l)f(\k+i-p+l)+:k+i)<' i - 
J l=k+l 

and conclude by noting that 

/k+i 
(R p kl i)(dX (k _ p+1)+ .. k ) [] Oi(Ai-i,dA,)*f(A (I _ jH . 1)+!l ) 
1 7.. I 1 



l=k+l 



> 



1 



Ctfe+1 ■ • • a fc+i 

since R^fj, is a probability measure. 



□ 
4 Mixing and contraction properties of the truncated filter



The truncated filter may be interpreted as a standard filter based on Markov
chain $\Lambda_n^p = \Lambda_{(n-p+1)^+:n}$. This insight allows us to establish
the contraction properties of the truncated filter.

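Lemma 3. One has:

$$h(R^p_{k+1:k+p}\mu, R^p_{k+1:k+p}\mu') \leq \frac{1}{\tilde{\varepsilon}^2_{k+1,p}} \|\mu - \mu'\|_{TV}$$

and

$$h(R^p_{k+1:k+p}\mu, R^p_{k+1:k+p}\mu') \leq \tilde{\rho}_{k+1,p}\, h(\mu, \mu')$$

where

$$\tilde{\varepsilon}^2_{k,p} = \frac{\varepsilon_k^2}{(a_k \cdots a_{k+p-2})(b_k \cdots b_{k+p-2})}, \qquad \tilde{\rho}_{k,p} = \frac{1 - \tilde{\varepsilon}^2_{k,p}}{1 + \tilde{\varepsilon}^2_{k,p}},$$

for all $k \geq 0$, and all $\mu, \mu' \in \mathcal{P}(E^{(k+1) \wedge p})$.

Note that $\tilde{\varepsilon}_{k,p}$ must be interpreted as a mixing coefficient,
and $\tilde{\rho}_{k,p}$ as a Birkhoff contraction coefficient.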
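
To give a feeling for the size of these coefficients, the sketch below evaluates them for constant sequences. It assumes the reading of Lemma 3 in which the squared mixing coefficient is $\varepsilon_k^2$ divided by the products of $a_i$ and $b_i$ for $i = k, \ldots, k+p-2$, and the Birkhoff coefficient is $(1 - \tilde{\varepsilon}^2)/(1 + \tilde{\varepsilon}^2)$; the function names and numerical values are hypothetical.

```python
def mixing_coefficient_sq(eps, a, b, k, p):
    """Squared mixing coefficient of p consecutive truncated steps starting at k:
    eps_k^2 / ((a_k ... a_{k+p-2}) (b_k ... b_{k+p-2}))."""
    prod = 1.0
    for i in range(k, k + p - 1):      # indices k, ..., k+p-2
        prod *= a[i] * b[i]
    return eps[k] ** 2 / prod


def birkhoff_coefficient(eps_sq):
    """Birkhoff contraction coefficient associated with a squared mixing coefficient."""
    return (1.0 - eps_sq) / (1.0 + eps_sq)


n = 20
eps = [0.5] * n                        # constant mixing constants (assumption)
a = [1.2] * n                          # constant bounds on the potentials (assumption)
b = [1.2] * n
eps_sq = mixing_coefficient_sq(eps, a, b, k=3, p=3)
rho = birkhoff_coefficient(eps_sq)
```

Increasing $p$ shrinks the mixing coefficient geometrically, so the contraction coefficient approaches one; this is the tension between truncation error and mixing that the choice of $p$ must balance.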

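Proof. Using Hypothesis 3, one has:

$$Q_{k+p}\gamma^p_{k+1:k+p-1}\mu(d\lambda_{k+1:k+p}) = \int \mu(d\lambda_{(k-p+1)^+:k}) \prod_{i=k+1}^{k+p} Q_i(\lambda_{i-1}, d\lambda_i) \prod_{i=k+1}^{k+p-1} \Psi_i^p(\lambda_{(i-p+1)^+:i})$$

$$\leq b_{k+1} \cdots b_{k+p-1} \int \mu(d\lambda_{(k-p+1)^+:k}) \prod_{i=k+1}^{k+p} Q_i(\lambda_{i-1}, d\lambda_i)$$

$$\leq \frac{b_{k+1} \cdots b_{k+p-1}}{\varepsilon_{k+1}}\, \xi_p(d\lambda_{k+1:k+p})$$

where $\xi_p$ stands for the following reference measure:

$$\xi_p(d\lambda_{k+1:k+p}) = \xi(d\lambda_{k+1}) \prod_{i=k+2}^{k+p} Q_i(\lambda_{i-1}, d\lambda_i).$$

One shows similarly that

$$Q_{k+p}\gamma^p_{k+1:k+p-1}\mu \geq \frac{\varepsilon_{k+1}}{a_{k+1} \cdots a_{k+p-1}}\, \xi_p(d\lambda_{k+1:k+p}).$$

Hence kernel $Q_{k+p}\gamma^p_{k+1:k+p-1}$ is mixing, with mixing coefficient
$\tilde{\varepsilon}_{k+1,p}$. Following Lemma 3.4 in Le Gland and Oudjane (2004),

$$h(R^p_{k+1:k+p}\mu, R^p_{k+1:k+p}\mu') = h(Q_{k+p}\gamma^p_{k+1:k+p-1}\mu, Q_{k+p}\gamma^p_{k+1:k+p-1}\mu') \leq \frac{1}{\tilde{\varepsilon}^2_{k+1,p}} \|\mu - \mu'\|_{TV}$$

using the scale invariance property of the Hilbert metric. Similarly, according
to Lemma 3.9 in the same paper:

$$h(R^p_{k+1:k+p}\mu, R^p_{k+1:k+p}\mu') = h(Q_{k+p}\gamma^p_{k+1:k+p-1}\mu, Q_{k+p}\gamma^p_{k+1:k+p-1}\mu') \leq \tilde{\rho}_{k+1,p}\, h(\mu, \mu').$$

□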



5 Propagation of truncation error 

We establish first the two following lemmas. 

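Lemma 4. Let $\hat{R}_n : E^{n \wedge p} \to \mathcal{P}(E^{(n+1) \wedge p})$ be a
sequence of (possibly random) probability kernels such that for all $n \geq 1$ and
$\mu \in \mathcal{P}(E^{n \wedge p})$,

$$\sup_{f : \|f\|_\infty = 1} E\left\{ \left| \langle R^p_n \mu - \hat{R}_n \mu, f \rangle \right| \right\} \leq \delta_n$$

where the expectation is w.r.t. the randomness of $\hat{R}_n$, then, for all
$n \geq 1$ and all $\zeta \in \mathcal{P}(E)$, one has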



q n 

sup E{{Rl n t-R Un tJ}}<-—S2 

/:||/IU = 1 1 J l0 g( 3 ) ~i 



log(3) ^ \ e?+i£?+ P +i , 



:J 1 J Pi+jp+l,p 



where $\hat{R}_{1:n}\zeta = \hat{R}_n \cdots \hat{R}_1 \zeta$, and with the
convention that empty products equal one.

Proof. The following difference can be decomposed into a telescopic sum:

$$R^p_{1:n}\zeta - \hat{R}_{1:n}\zeta = \sum_{i=1}^{n} \left( R^p_{i+1:n} R^p_i \hat{R}_{1:i-1}\zeta - R^p_{i+1:n} \hat{R}_i \hat{R}_{1:i-1}\zeta \right).$$

We fix the integers $i$, $n$, and consider some arbitrary test function $f$. For
$i > n - 2p$, one may apply Lemma 2:

SUp E { (^l:n^f^l:i-lC - R! +1 .. n RiRv.i-l{, f) } 

< 2(ai+i...a n )(bi+i...b n )6i 
8 & 



log(3) £i+i t p£i+p+i lP 



since e„ < 1, a n > 1 and b n > 1 for all n. 

For $i \leq n - 2p$, let $k = \lfloor (n-i)/p \rfloor$; then, using Lemma 3 and
Equations (2.2) to (2.5), one has



< 
< 

< 



R i+l:n R i R l-i-lQ ~ R P +i :n RiRl:i-lC 

2 



TV 



log(3) 



h {R p t+1:i+k pR P iRi-.i-iC,, R p i+1 , i+kp RiRi:i-iC, 



lo g (3)£| +p+1)P jl=2 



fe-i 

_Q Pi+jp+l,p x 



- rl i+l:i+p' / ^i+l-.i+p" 



TV 



where $\nu = R^p_i \hat{R}_{1:i-1}\zeta$, $\nu' = \hat{R}_i \hat{R}_{1:i-1}\zeta$.
Applying (7) p. 160 of Le Gland and Oudjane (2004), one gets



v 



Jx i+l:i+p u ^i+V.i+p 1 - 



< 2 



Wli+VA+p 11 ~ 7i+l:i +p v'\\TV 



TV 



li+l:i+p V ( l ) 



where, using the same calculations as in Lemma 3,



7f+l:. i+p Kl) > 



O-i+l ■ ■ ■ O-i+p 



and 



\~p ~ P /ii 

\li+l:i+p v ~ li+l:i+p v Wtv 



E 



x'EEp 



< 



< 



sup 

x£EP Jx'eEP 

'i+i ■ ■ ■ bi+p 



x£Ep 
"fi+l:i+p( X i dx ) 



sup E(|<i/-i/,0)|) 



sup E(\(v-v',<f>)\) 

II0IU = 1 



which ends the proof. 

Lemma 5. For all $n \geq 1$ and all $\zeta \in \mathcal{P}(E)$, one has

i , l(n—i)/p]-l 



□ 



< 



4 T p 



TV log 3 



n 



_j c i+l,p j = 1 

with the convention that empty sums equal zero, and empty products equal one. 
Proof. One has:

$$R^p_{1:n}\zeta - H^p_n R_{1:n}\zeta = \sum_{i=1}^{n} \left( R^p_{i+1:n} R^p_i H^p_{i-1} R_{1:i-1}\zeta - R^p_{i+1:n} H^p_i R_{1:i}\zeta \right).$$

For $i \leq n - p$, let $k = \lfloor (n-i)/p \rfloor$; then, according to Lemma 3,



TV 



< 



log(3)e? +1 >p 



fe-i 

n 



Pi+jp+l,p 



B*Hl_ 1 R Ui -it-HfR 1 *C 



TV 



and one concludes using Lemma 1. For $i > n - p$, one can apply Lemma 1
directly. □



6 Coupling of particle approximations 

We now introduce two interactive particle systems: the first particle system 
approximates the true filter, and is equivalent to the type of particle algorithms 
studied in this paper, and the second particle system approximates the truncated 
filter, and corresponds to an artificial algorithm that would not be implemented 
in practice. We work out a way of coupling both particle systems in order to 
evaluate the distance between the two (in a sense that is made clear below). 
We define, for n > 0, 



(n-p) 



-:n-l) ^\n-p+l)+:n 



xQ n (\„^i,d\' n ), 

Qn(^0:n-l,dX' . n ) = 6\ 0:n _ 1 (dX' Q . n _ 1 ) X Q n (A n _i , dX n ) . 

R+, W £ M+{EP), V 



We define £ M+(E n+1 ), V measurable / : E n+1 
measurable g : £("+!) A ?> -> M+, 



* p i/(a) = v ni 



For any measurable space $(E', \mathcal{E}')$ and any measure
$\mu' \in \mathcal{P}(E')$, we can take $Z_1, Z_2, \ldots$ i.i.d. of law $\mu'$
and define the random empirical measure, for $N \geq 1$,

$$S^N(\mu') = \frac{1}{N} \sum_{i=1}^{N} \delta_{Z_i}.$$

Notice that, as the $Z_1, Z_2, \ldots$ are only given in law, we only define
$S^N(\mu')$ in law. We define the random operators $R_n^N$, $R_n^{p,N}$
($\forall n$) by: $\forall \mu \in \mathcal{P}(E^n)$, $R_n^N \mu$ is
a random weighted empirical measure such that



Similarly, V/x' S T'(E p ^ n ), R^ N fi' is a random weighted empirical measure such 
that 

i&V = ^.^(Q^O ■ (6.7) 
As pointed out above, $R_n^N \mu$ and $R_n^{p,N} \mu'$ are only defined in law.
Since $\zeta$ denotes the probability density of the first state $\Lambda_0$, the
particle system with $N$ particles approximating the true filter at time $n$ is
defined by



n n -"-n-l 



and the particle system with N particles approximating the truncated filter at 
time n is defined by 



R^R 



n-1 



.R^C 



Lemma 6. There exists a coupling such that, for all $k \geq 1$ and
$\mu \in \mathcal{P}(E^k)$:

$$\sup_{f : \|f\|_\infty \leq 1} E\left( \left| \langle R_k^{p,N} H^p_{k-1} \mu - H^p_k R_k^N \mu, f \rangle \right| \right) \leq \phi_k \tau^p.$$

As $H^p_k R_k^N \mu$ and $R_k^{p,N} H^p_{k-1} \mu$ are defined to be random
variables with such and such law, the term "coupling" means that we can define a
random variable $(H^p_k R_k^N \mu, R_k^{p,N} H^p_{k-1} \mu)$ with the desired
marginals.

Proof. To prove the above result, we produce a coupling between the two random
measures $R_k^{p,N} H^p_{k-1} \mu$ and $H^p_k R_k^N \mu$. Let

^n{^0:n) = ( \n-p+l) + :n ) ) 

so that, for fj, £ r P(E k ), and using (|6,7p . one has 

Hm k .(S N (Q k v)) 



R P,N H P u 



in the sense that both sides define the same distribution. Let xi, . . . ,Xn i-i-d- 
~ fJ-Qk, where Xi is a vector Xo-.k,i, for i — 1, . . . , N, and Xi denotes its projection 
on the p last components, Xi 



\k-p+l)+:k,ii then 



JV 



£j=i*k(XjO 1~1 



^k(Xi)^xi nas same law as H p Rk A* 



and 



N 



For any / such that ||/||oo < 1 (using a classical result on empirical measures): 
\{Rl' N H p _ l( ,-H p R^J)\ 



2_. ^k(Xi)^xi has same law as R^ N H%_ 1 i 



< 



< 



1 N 



i=l 
N 



*fc(Xi) 



**(Xi) 



E 



i=l 



^k{Xi) -*fc(xO 



< 



E 



Ef=i* fc (xi 



*fc(xi)Eli(*fc(^)-**(xj)) 



{Ef =1 {£?=i 



*fe(Xi)Ef=i*fe(Xi)A*(xi) 



Ef =1 ^te) {Ef =1 *(xi)} {Ef =1 Mx,)}, 



<4>kT P 

using Hypothesis 2, from which we deduce the result. 



□ 
7 Main result



We are now able to derive estimates of the error

$$\mathcal{E}^p_{n,N}(y_{1:n}) = \sup_{f : \|f\|_\infty = 1} E_N\left( \left| \langle H^p_n R^N_{1:n}\zeta - H^p_n R_{1:n}\zeta, f \rangle \right| \,\Big|\, Y_{1:n} = y_{1:n} \right)$$

induced by the particle approximation of the true filter, for the marginal
filtering distribution of the $p$ last states, provided $p \leq n$. The
expectation $E_N$ is with respect to the randomness of the $N$ particles, and the
functions $f$ are $E^{(n+1) \wedge p} \to \mathbb{R}$. Note that
$\mathcal{E}^p_{n,N}(y_{1:n})$ is by construction an increasing function of $p$.

Theorem 7. For any C £ V(E), and any test function w.r.t. E^ n+1 ' Ap , 



4 n 6- 



where 



l ~i+l,p^i+p+l,p 

x q PA j 4a ' :6 ' 



L(n-i)/pJ-l 

JJ Pi+JP+l 
J=2 



(7., 



iV 



Proof. We first study the following local error, for $\mu \in \mathcal{P}(E^n)$



sup Eat 

/=ll/lloo = l 



^0:n — 2/0 :ra 



where the difference of operators can be decomposed into:

To bound the first term, one may use (25) p. 162 of Le Gland and Oudjane
(2004), for $\nu = H^p_{n-1}\mu$ and Hypothesis 3:



E 



< 



N 



and, for the second term, one may apply Lemma 6:



N . 



< 



so that 



sup Eat 
MI/IL=i 



'R^-HPR^J^ 
for $\delta'_n = 2 a_n b_n / \sqrt{N} + \phi_n \tau^p$. This local error is propagated using Lemma 4:



-■A 



L^J-i 



< 



log(3) 



E 



n 



~2 ~2 

e i+l £ i+p+l ~ = ~^ 



Pi+jp+l,p 



To conclude, one may decompose the global error as follows: 

where the second term is bounded above, and the first term is directly bounded
using Lemma 5. □
Since $p$ is an arbitrary parameter, one may minimise the error bound with
respect to $p$. For instance, one has the following result for time-uniform
estimates. As noted above, the error $\mathcal{E}^p_{n,N}(y_{1:n})$ is an
increasing function of $p$, so the bound below applies a fortiori to
$\mathcal{E}^1_{n,N}(y_{1:n})$, the particle error corresponding to
the marginal filtering distribution of the last state $\Lambda_n$.

Corollary 8. If there exist constants $c$, $\varepsilon$, $\phi > 0$ such that,
almost surely, $a_n b_n \leq c$, $\varepsilon_n \geq \varepsilon$, and
$\phi_n \leq \phi$, then, provided $\tau c^3 < 1$, the particle error is
bounded almost surely as follows:



8%{ Vl .. n )<C{\og{N) + D} 
for N large enough, where 



1 



1+3 log c/ log T 



16 



-1 



and 



° e 6 c 2 (ylogrj 

p = log 



3 log c/ log -3 



4c 



, ^ = 21og(30/4cr), 
/logr . 



Proof. Under these conditions, the RHS of (7.8) is smaller than or equal to:



(7.9) 



£l N {yi:n) < 



< 



< 



< 



4 


c 2(p-2) 


log 3 


e 4 


4 


c 2(p~2) 


log 3 


e 4 


4 


c 2(p-2) 



30r p 



log 3 e 4 

4 c 3(p-2) 



L(n-i)/pJ-2 

(7.10) 



i/p-1 



3 T p + 4l 



l-e z c 



2^-(p-2)) 



iVy i_(i_ 



e 2 c -(p-2) 



i/p 



3*^ + ^1* 



for $p$ large enough, since $(1 - x)^a \leq 1 - ax$ for $a \in (0, 1)$,
$x \in (0, 1)$, so, provided $c^3 \tau < 1$, one may take $p$ as in (7.9), which
gives:



30 



-2 logr 



and conclude. 



□ 

Obviously, this is a qualitative result, in that there are many practical models 
where such time-uniform, deterministic bounds are not available. For specific 
models, one may be able instead to use (7.8) in order to establish the asymptotic
stability of the expected particle error, where the expectation is with respect to
the observed process $(Y_n)$. We provide an example of this approach in Section 8.
8 Applications to practical models



In this section, we apply our general result to three practical models. We keep
the same settings and notations, i.e. the observed process $(Y_n)$ admits some
probability distribution conditional on the path $\Lambda_{0:n} = \lambda_{0:n}$
of a Markov chain $(\Lambda_n)$, with initial distribution $\zeta$ and Markov
transition $Q_n$, which fulfil Hypothesis 1, see Section 2. We derive conditions
on the model parameters that ensure asymptotic stability of the particle error;
in particular, these conditions imply that Hypotheses 2 and 3 are verified.

We state the following trivial result for further reference. Let $(f, g)$ be a
pair of probability densities on $E$, then:

$$\forall x \in E,\ |\log\{f(x)\} - \log\{g(x)\}| \leq c \ \Rightarrow\ \forall x \in E,\ |f(x) - g(x)| \leq (e^c - 1)\{f(x) \wedge g(x)\} \qquad (8.11)$$

for $c > 0$.



8.1 GARCH Mixture model 

We assume that the observed process is such that

$$Y_n = \sigma_n(\Lambda_{0:n}) Z_n, \quad n \geq 1,$$

where the $Z_n$'s are i.i.d. $N(0, 1)$ random variables, and the variance function
$\sigma_n^2$ is defined recursively, for $n \geq 1$:

$$\sigma_n^2(\lambda_{0:n}) = \alpha(\lambda_n) + \beta(\lambda_n) y_n^2 + \gamma(\lambda_n) \sigma^2_{n-1}(\lambda_{0:n-1}) \qquad (8.12)$$

and $\sigma_0^2(\lambda_0) = \alpha(\lambda_0) / \{1 - \gamma(\lambda_0)\}$, where
$\alpha$, $\beta$ and $\gamma$ are $E \to \mathbb{R}_+$ functions.
Conditional on $\Lambda_{0:n}$, $(Y_n)$ is a GARCH (generalised autoregressive
conditional heteroskedasticity) process (Bollerslev, 1986); see Chopin (2007) for
a finance application of such a GARCH mixture model.
The potential functions equal

$$\Psi_n(\lambda_{0:n}) = \frac{1}{\sqrt{2\pi\sigma_n^2(\lambda_{0:n})}} \exp\left\{ -\frac{y_n^2}{2\sigma_n^2(\lambda_{0:n})} \right\}$$

for $\lambda_{0:n} \in E^{n+1}$, and $(\Lambda_n)$ is a Markov process, with Markov
kernels $Q_n$, which satisfy Hypothesis 1.
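
The forgetting mechanism behind Hypothesis 2 can be observed directly on the variance recursion. The sketch below is an illustration only: the two-state functions `alpha`, `beta`, `gamma`, the path and the observations are hypothetical, and $\beta$ is set to zero so that the variance stays within deterministic bounds. It compares the variance along a full path with the variance along a path whose states before the last $p$ are replaced by an arbitrary $z$, and records the relative error on the potential.

```python
import math


def garch_variance(path, ys, alpha, beta, gamma):
    """Variance at time n along lambda_{0:n}, following the recursion (8.12);
    ys[k-1] is the observation entering step k, and the initial variance is
    alpha / (1 - gamma) evaluated at lambda_0."""
    s2 = alpha(path[0]) / (1.0 - gamma(path[0]))
    for lam, y in zip(path[1:], ys):
        s2 = alpha(lam) + beta(lam) * y * y + gamma(lam) * s2
    return s2


def potential(path, ys, alpha, beta, gamma):
    """Gaussian potential of the last observation given the path."""
    s2 = garch_variance(path, ys, alpha, beta, gamma)
    y = ys[-1]
    return math.exp(-0.5 * y * y / s2) / math.sqrt(2.0 * math.pi * s2)


def pad(last_states, n, z):
    """Path of length n+1 whose last p states are given and the others equal z."""
    return (z,) * (n + 1 - len(last_states)) + tuple(last_states)


# Hypothetical two-state parameterisation of the simplified model (beta = 0).
alpha = lambda lam: 0.1 + 0.1 * lam
beta = lambda lam: 0.0
gamma = lambda lam: 0.5 + 0.2 * lam
gamma_max = 0.7
s2_min, s2_max = 0.1 / (1.0 - 0.5), 0.2 / (1.0 - 0.7)

path = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0)          # lambda_{0:10}
ys = [0.4, -0.3, 1.1, 0.2, -0.8, 0.5, -0.1, 0.9, -0.6, 0.3]
n = len(ys)

full_s2 = garch_variance(path, ys, alpha, beta, gamma)
full_psi = potential(path, ys, alpha, beta, gamma)
gaps, rel_errors = {}, {}
for p in (2, 4, 6, 8):
    truncated = pad(path[-p:], n, z=0)
    gaps[p] = abs(full_s2 - garch_variance(truncated, ys, alpha, beta, gamma))
    psi_p = potential(truncated, ys, alpha, beta, gamma)
    rel_errors[p] = abs(full_psi - psi_p) / min(full_psi, psi_p)
```

In this toy setting the gap on the variance is at most `gamma_max ** p * (s2_max - s2_min)`, since the difference between the two variances is multiplied by one value of $\gamma$ at each of the last $p$ steps.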

The functions $\alpha$, $\beta$ and $\gamma$ are assumed to be bounded as follows:

$$0 < \alpha_{\min} \leq \alpha(\lambda) \leq \alpha_{\max}, \qquad 0 \leq \beta_{\min} \leq \beta(\lambda) \leq \beta_{\max} < 1, \qquad 0 \leq \gamma_{\min} \leq \gamma(\lambda) \leq \gamma_{\max} < 1.$$

We first consider the case where $\beta(\lambda) = 0$ for all $\lambda \in E$. As
mentioned in the introduction, this simplified model can be interpreted as a
standard hidden Markov model, with observed process $(Y_n)$, and Markov chain
$(\Lambda_n, \sigma_n^2(\Lambda_{0:n}))$. However, since
$\sigma_n^2(\lambda_{0:n})$ is a deterministic function of
$\sigma_{n-1}^2(\lambda_{0:n-1})$ and $\lambda_n$, it does not have mixing or
similar properties that are usually required to obtain estimates of the particle
error. Instead, analysing this model as a Feynman-Kac flow with iterative,
path-dependent potential functions makes it possible to derive such estimates.
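
For concreteness, a particle algorithm of this type can be sketched as follows for the simplified model ($\beta = 0$) on a finite state space. Each particle carries its current state together with its current variance, so the cost per iteration does not grow with time. This is a minimal sketch under stated assumptions (uniform initial law, multinomial resampling at every step, hypothetical parameter values and function name), not a reference implementation of the algorithm analysed here.

```python
import numpy as np


def garch_mixture_particle_filter(ys, Q, alpha, gamma, n_particles, rng):
    """Return particle estimates of P(Lambda_n = 1 | y_{1:n}) for each n.
    Q is a transition matrix; alpha and gamma are arrays indexed by state."""
    n_states = Q.shape[0]
    lam = rng.integers(0, n_states, size=n_particles)    # uniform initial law (assumption)
    s2 = alpha[lam] / (1.0 - gamma[lam])                  # initial variance of each particle
    cdf = np.cumsum(Q, axis=1)
    estimates = []
    for y in ys:
        # Mutation: move each particle according to its row of Q.
        u = rng.random(n_particles)
        lam = np.minimum((u[:, None] > cdf[lam]).sum(axis=1), n_states - 1)
        # Recursive update of the path-dependent variance (beta = 0).
        s2 = alpha[lam] + gamma[lam] * s2
        # Weighting by the Gaussian potential, then normalisation.
        w = np.exp(-0.5 * y * y / s2) / np.sqrt(2.0 * np.pi * s2)
        w = w / w.sum()
        estimates.append(float(np.sum(w * (lam == 1))))
        # Selection: multinomial resampling of the pairs (state, variance).
        idx = rng.choice(n_particles, size=n_particles, p=w)
        lam, s2 = lam[idx], s2[idx]
    return estimates


rng = np.random.default_rng(1)
Q = np.array([[0.95, 0.05], [0.10, 0.90]])
alpha = np.array([0.1, 0.2])
gamma = np.array([0.5, 0.7])
ys = [0.2, -0.1, 1.5, -1.8, 0.3, 0.1]                     # hypothetical observations
estimates = garch_mixture_particle_filter(ys, Q, alpha, gamma, 500, rng)
```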
Lemma 9. For the simplified model described above (with $\beta = 0$), the expected
particle error of the corresponding particle approximation is uniformly stable in
time, i.e. there exist constants $C$, $D$, such that



E 



£i N (Yl:n) <C{\og(N) + D} 



1 



1+3 log c/ log 



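where $p$ is given by (7.9), provided $\iota < 2$ and $\tau c^3 < 1$, where
$\tau = \gamma_{\max}$, $c = (2/\iota - 1)^{-1/2}$, and

$$\iota = \frac{\alpha_{\max} (1 - \gamma_{\min})}{\alpha_{\min} (1 - \gamma_{\max})}.$$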



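Proof. From (8.12), one sees the process $\sigma_n^2$ is bounded,
$\sigma^2_{\min} \leq \sigma_n^2(\lambda_{0:n}) \leq \sigma^2_{\max}$
for all $\lambda_{0:n} \in E^{n+1}$, where

$$\sigma^2_{\max} = \frac{\alpha_{\max}}{1 - \gamma_{\max}}, \qquad \sigma^2_{\min} = \frac{\alpha_{\min}}{1 - \gamma_{\min}},$$

so, for a given sequence of observations $y_{1:n}$, Hypothesis 3 is verified with:

$$\frac{1}{a_n} = \frac{1}{\sqrt{2\pi\sigma^2_{\max}}} \exp\left( -\frac{y_n^2}{2\sigma^2_{\min}} \right), \qquad b_n = \frac{1}{\sqrt{2\pi\sigma^2_{\min}}} \exp\left( -\frac{y_n^2}{2\sigma^2_{\max}} \right),$$

provided the truncated potential is taken as:

$$\Psi_n^p(\lambda_{n-p+1:n}) = \Psi_n(z, \ldots, z, \lambda_{n-p+1:n})$$

where $z$ is an arbitrary element of $E$. For Hypothesis 2, one has, for any
$\lambda_{0:n}, \lambda'_{0:n} \in E^{n+1}$ such that
$\lambda_{(n-p+1)^+:n} = \lambda'_{(n-p+1)^+:n}$:

$$|\log \Psi_n(\lambda_{0:n}) - \log \Psi_n(\lambda'_{0:n})| \leq \frac{1}{2} |\log \sigma_n^2(\lambda_{0:n}) - \log \sigma_n^2(\lambda'_{0:n})| + \frac{y_n^2}{2} \left| \frac{1}{\sigma_n^2(\lambda_{0:n})} - \frac{1}{\sigma_n^2(\lambda'_{0:n})} \right|$$

$$\leq \frac{\sigma^2_{\min} + y_n^2}{2\sigma^4_{\min}} \left| \sigma_n^2(\lambda_{0:n}) - \sigma_n^2(\lambda'_{0:n}) \right|$$

where $\sigma_n^2$ is contracting, in the sense that, for $n \geq p$,

$$\left| \sigma_n^2(\lambda_{0:n}) - \sigma_n^2(\lambda'_{0:n}) \right| = \left\{ \prod_{i=0}^{p-1} \gamma(\lambda_{n-i}) \right\} \left| \sigma^2_{n-p}(\lambda_{0:n-p}) - \sigma^2_{n-p}(\lambda'_{0:n-p}) \right| \leq 2 \gamma^p_{\max} \sigma^2_{\max}.$$

Thus, using (8.11), and the fact that $(e^x - 1)/x$ is an increasing function,
Hypothesis 2 is verified with $\tau = \gamma_{\max}$ and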



rV 2 fcr 2 . +w 2 ) 
max V nun 1 

mm 



exp 



for any $q \leq p$. Finally, to compute the expectation with respect to process
$(Y_n)$ of the error bound (7.8), one may use repeatedly the following results:

$$E\left[ \exp(a Y_n^2) \,|\, Y_{1:n-1} \right] \leq (1 - 2a\sigma^2_{\max})^{-1/2}$$

for $a < 1/(2\sigma^2_{\max})$, using standard calculations and the fact that
$Y_n$, conditional on $Y_{1:n-1}$ and $\Lambda_{0:n} = \lambda_{0:n}$, is
$N(0, \sigma_n^2(\lambda_{0:n}))$. This implies in particular that:

$$E\left[ a_n b_n \,|\, Y_{1:n-1} \right] \leq \left( \frac{2}{\iota} - 1 \right)^{-1/2} = c$$

where the constant $c$ is well-defined since
$\iota = \sigma^2_{\max}/\sigma^2_{\min} < 2$, then by Jensen's inequality,

$$E\left[ \frac{1}{a_n b_n} \,\Big|\, Y_{1:n-1} \right] \geq c^{-1},$$

and similarly,

-1/2 



E[^,|y 1:n _i]<r-« 



{2 \ / 4 \ ~ l / z 

1-2t«^ -1 
<i„J V Kin J 



where $\phi$ is properly defined for $q$ large enough. Using the above results
recursively on the sum on the RHS of (7.8), one obtains the same expression as in
(7.10) for the error bound as in Corollary 8 for time-uniform estimates (with
the values of $c$, $\phi$, $\tau$ as defined above), and concludes similarly. □

If (3 is allowed to take positive values, stability results may be obtained under 
more restrictive conditions. In particular, one may impose that 7 is a constant 
function. 

Lemma 10. For the general mixture GARCH model defined above, the expected
particle error is uniformly stable in time, i.e. there exist constants $C$, $D$,
such that



E 



£ P n , N (Y 1:n ) <C{log(N)+D}[ — 



N 1+3 log c/ log 7 



provided 7 is a constant function, 7(A) = 7, tc 3 < 1, § < 2, where t = 7, 
c = (2/i? — 1) 1 , p is given by $7.9\) . and 



$ = 1 amax v P' 



max 



*min Mmin 



0n 



Proof. We follow the same lines as above, except that the bounds of the process 
^ni^o-.n) must be replaced by: 

2 7™ 2 k 

1 — 7 L — ' 



^maxW = Y~ amaX + H( amax + P^yl- k )l k , 



n-1 



7 

fe=0 



which, by construction, are such that 

Cx(n) 



< d < 2. 



O'minW 

Hence, one has again 

/ 2 X-l/2 

E[a n 6 n |F 1:n _!]< 2-f^-l 
and the rest of the calculation is identical to that of the previous lemma, with
$\tau = \gamma$. □



8.2 Mixture Kalman model 



We focus on a univariate linear Gaussian model, i.e. conditional on Markov
process $(\Lambda_n)$, one has $X_0 = 0$ almost surely, and, for $n \geq 1$,

$$X_n = h(\Lambda_n) X_{n-1} + \sqrt{w(\Lambda_n)}\, W_n,$$

$$Y_n = X_n + \sqrt{v(\Lambda_n)}\, V_n,$$

where the $V_n$'s and the $W_n$'s are independent $N(0, 1)$ variables, and $h$,
$v$, $w$ are real-valued functions. Using the recursions of the Kalman-Bucy filter
(Kalman and Bucy, 1961), one is able to marginalise out the process $X_n$, and
compute recursively the probability density of $Y_n$, conditional on
$\Lambda_{0:n} = \lambda_{0:n}$, in the following way:

$$\Psi_n(\lambda_{0:n}) = \frac{1}{\sqrt{2\pi\sigma_n^2(\lambda_{0:n})}} \exp\left\{ -\frac{\{y_n - \mu_n(\lambda_{0:n})\}^2}{2\sigma_n^2(\lambda_{0:n})} \right\}$$

where the following quantities are defined recursively: for $n \geq 1$,

$$\mu_n(\lambda_{0:n}) = h(\lambda_n)\, m_{n-1}(\lambda_{0:n-1}) \qquad (8.13)$$

$$\sigma_n^2(\lambda_{0:n}) = h(\lambda_n)^2 c_{n-1}(\lambda_{0:n-1}) + v(\lambda_n) + w(\lambda_n) \qquad (8.14)$$

$$a_n(\lambda_{0:n}) = \left\{ h(\lambda_n)^2 c_{n-1}(\lambda_{0:n-1}) + w(\lambda_n) \right\} / \sigma_n^2(\lambda_{0:n}) \qquad (8.15)$$

$$m_n(\lambda_{0:n}) = h(\lambda_n)\, m_{n-1}(\lambda_{0:n-1}) + a_n(\lambda_{0:n}) \left\{ y_n - \mu_n(\lambda_{0:n}) \right\} \qquad (8.16)$$

$$c_n(\lambda_{0:n}) = h(\lambda_n)^2 c_{n-1}(\lambda_{0:n-1}) + w(\lambda_n) - a_n(\lambda_{0:n})^2 \sigma_n^2(\lambda_{0:n}) \qquad (8.17)$$

and $m_0(\lambda_0) = c_0(\lambda_0) = 0$.
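
The recursions (8.13) to (8.17) translate into a one-step update that each particle would perform after drawing its new state. The sketch below assumes the values of $h$, $v$ and $w$ at the current state are passed as plain numbers; the function name and the numerical values are hypothetical.

```python
import math


def kalman_potential_step(m_prev, c_prev, y, h, v, w):
    """One step of the recursions: returns the potential, and the updated
    filtering mean and variance, given their previous values."""
    mu = h * m_prev                          # predictive mean of Y_n, as in (8.13)
    s2 = h * h * c_prev + v + w              # predictive variance of Y_n, as in (8.14)
    a = (h * h * c_prev + w) / s2            # gain, as in (8.15)
    m = mu + a * (y - mu)                    # filtering mean, as in (8.16)
    c = h * h * c_prev + w - a * a * s2      # filtering variance, as in (8.17)
    psi = math.exp(-0.5 * (y - mu) ** 2 / s2) / math.sqrt(2.0 * math.pi * s2)
    return psi, m, c


# First step from the initial values m_0 = c_0 = 0.
psi, m, c = kalman_potential_step(0.0, 0.0, y=1.0, h=0.8, v=1.0, w=0.5)
```

With these inputs the updated variance satisfies $1/c = 1/v + 1/w$, since the previous variance is zero.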

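We make the following assumptions:

1. Functions $v$ and $w$ are bounded as follows: for all $\lambda \in E$,

$$0 < \underline{v} \leq v(\lambda) \leq \bar{v}, \qquad 0 < \underline{w} \leq w(\lambda) \leq \bar{w}.$$

2. Function $h$ is bounded as follows: for all $\lambda \in E$,

$$|h(\lambda)| \leq \bar{h} < 1.$$

We first prove the following intermediate results.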

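Lemma 11. The sequence $\sigma_n^2$ is bounded and uniformly contracting, i.e. for all $p \ge 1$, for all $\lambda_{0:n}$, $\lambda'_{0:n}$ such that $\lambda_{n-p+1:n} = \lambda'_{n-p+1:n}$, one has

\[ \underline{\sigma}^2 \le \sigma_n^2(\lambda_{0:n}) \le \bar{\sigma}^2, \qquad \left| \sigma_n^2(\lambda_{0:n}) - \sigma_n^2(\lambda'_{0:n}) \right| \le C_\sigma \tau_\sigma^p, \]

where $\underline{\sigma}^2 = \underline{v} + \underline{w}$, $\bar{\sigma}^2 = (\bar{h}^2 + 1)\bar{v} + \bar{w}$, $C_\sigma = \bar{h}^2 \bar{v} / \tau_\sigma$, and

\[ \tau_\sigma = \frac{1}{1 + \underline{w}/\bar{v} + 2\sqrt{\underline{w}/\bar{v} + \underline{w}^2/\bar{v}^2}} < 1. \]

Proof. From (8.17), one deduces that

\[ \frac{1}{c_n(\lambda_{0:n})} = \frac{1}{v(\lambda_n)} + \frac{1}{h(\lambda_n)^2 c_{n-1}(\lambda_{0:n-1}) + w(\lambda_n)} \tag{8.18} \]

thus

\[ \left( \frac{1}{\underline{v}} + \frac{1}{\underline{w}} \right)^{-1} \le c_n(\lambda_{0:n}) \le \bar{v} \]

and, from (8.14), that

\[ \underline{v} + \underline{w} \le \sigma_n^2(\lambda_{0:n}) \le (\bar{h}^2 + 1)\bar{v} + \bar{w}. \]

In addition, (8.18) implies that

\[ \log\{c_n(\lambda_{0:n})\} = T\left( \log\{c_{n-1}(\lambda_{0:n-1})\}, \lambda_n \right) \]

where

\[ T(c, \lambda) = -\log\left\{ \frac{1}{v(\lambda)} + \frac{1}{h(\lambda)^2 e^{c} + w(\lambda)} \right\}. \]

It is easy to show that, for a fixed $\lambda$, the derivative of $T(c, \lambda)$ with respect to $c$ is bounded from above by $\tau_\sigma$ as defined above. Thus, $T(c, \lambda)$ is a contracting function, and, by induction, for $n \ge p$,

\begin{align*}
\left| \sigma_n^2(\lambda_{0:n}) - \sigma_n^2(\lambda'_{0:n}) \right| &= |h(\lambda_n)|^2 \left| c_{n-1}(\lambda_{0:n-1}) - c_{n-1}(\lambda'_{0:n-1}) \right| \\
&\le \bar{h}^2 \bar{v} \left| \log c_{n-1}(\lambda_{0:n-1}) - \log c_{n-1}(\lambda'_{0:n-1}) \right| \\
&\le C_\sigma \tau_\sigma^p
\end{align*}

where $\tau_\sigma$ and $C_\sigma$ were defined above. □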
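The forgetting property stated in Lemma 11 can be observed numerically. The following Python sketch is only an illustration under assumed values: the two regimes and their $(h, v, w)$ triples are hypothetical, and the names `sigma2_path`, `REGIMES` and `final_gap` do not come from the paper. Two paths that differ on every state except the last $p$ are compared, and the gap between their final predictive variances is recorded as $p$ grows.

```python
def sigma2_path(hs, vs, ws):
    """Return [sigma_1^2, ..., sigma_n^2] using (8.14), with c_n updated via (8.18)."""
    c, out = 0.0, []                          # c_0 = 0
    for h, v, w in zip(hs, vs, ws):
        pred_var = h * h * c + w              # h^2 c_{n-1} + w
        out.append(pred_var + v)              # (8.14)
        c = pred_var * v / (pred_var + v)     # (8.18): 1/c_n = 1/v + 1/pred_var
    return out


# Hypothetical two-regime example: regime label -> (h, v, w).
REGIMES = {0: (0.9, 1.0, 0.5), 1: (0.3, 2.0, 1.5)}


def final_gap(p, n=30):
    """|sigma_n^2 - sigma_n'^2| for two paths that share only their last p states."""
    path_a = [0] * n
    path_b = [1] * (n - p) + [0] * p
    sig_a = sigma2_path(*zip(*(REGIMES[s] for s in path_a)))
    sig_b = sigma2_path(*zip(*(REGIMES[s] for s in path_b)))
    return abs(sig_a[-1] - sig_b[-1])


gaps = [final_gap(p) for p in range(1, 9)]
for p, gap in enumerate(gaps, start=1):
    print(p, gap)
```

In this example the gap shrinks by a roughly constant factor each time $p$ increases by one, which is the geometric decay the lemma describes; the sketch does not attempt to reproduce the constants $C_\sigma$ and $\tau_\sigma$.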

Lemma 12. The sequence $\mu_n$ is bounded and contracting in the sense that there exists $C_\mu > 0$ such that, for all $p \ge 1$, for all $n \ge p$, and $\lambda_{0:n}$, $\lambda'_{0:n}$ such that $\lambda_{n-p+1:n} = \lambda'_{n-p+1:n}$, one has

:n) f-^7i 

1 — an 

where 

M n = max (\yi\) , T = T a Vh, a=[l 

i— l,...n \ 

Proof. Note first that 

w ( 
1 - 5 = _ = < a„(A 0: „) < 1 ■ 



v 

a 



h 2 v + w + v J v + w 



h 2 v + w + v 

so one shows recursively, using (8.13) and (8.16), that

\[ |\mu_n(\lambda_{0:n})| \le \frac{\bar{a}\bar{h}}{1 - \bar{a}\bar{h}}\, M_{n-1} \]

and that, for $\lambda_{0:n}$, $\lambda'_{0:n}$ such that $\lambda_{n-p+1:n} = \lambda'_{n-p+1:n}$,



.19) 



1=1 



i=i j'=i 



2/i p+1 



where $a_{n-i}$, $a'_{n-i}$ are shorthands for $a_{n-i}(\lambda_{0:n-i})$, $a_{n-i}(\lambda'_{0:n-i})$. The sequence $a_n$ itself is contracting, since, from (8.15), one has, for $i < p$:



— ~~i \ a n-ii^O:n-i) ~ &n— i-1 iK:n-i) I 





< 


V , 













„4 'a 



so (8.19) and the fact that $|xy - x'y'| \le |x - x'| + |y - y'|$ provided $x, x', y, y' \in [0, 1]$ lead to



\^n{^0:n) ~ Mn(Ao :n )l 



P—I 



— i=l 



i=l 



2h p+1 



+ 1 



for $\tau = \tau_\sigma \vee \bar{h}$, and a well-chosen value of $C_\mu$. □

We are now able to state the main result.



Lemma 13. For the model above, the particle error is bounded uniformly in 
time, i.e. there exist C, D, such that 



£ p n , N (yv. n ) < C {\og(N) + D} { — 



1 



1+3 log c/ log i 



almost surely, for $p$ given by (7.9), provided the realizations $y_n$ are bounded, i.e. $|y_n| \le C_y$ for all $n \ge 1$, and that $\tau c^3 < 1$, with $\tau = \bar{h} \vee \tau_\sigma$ and



c = — exp 
er 



ah 



1 — ah 



1 + w/v + + w 2 /v 2 



< 1. 



Proof. This proposition is a direct application of Corollary 10, so we need only prove that Hypotheses 2 and 3 are fulfilled. For Hypothesis 2, one may take



1 



1 



V27 



exp 



4. 



ah 



1 — ah 



b n = 



1 



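so that $a_n b_n \le c$ for $c$ defined above. For Hypothesis 3, one has:

\[ 2\left| \log \Psi_n(\lambda_{0:n}) - \log \Psi_n(\lambda'_{0:n}) \right| \le \left| \log \sigma_n^2(\lambda_{0:n}) - \log \sigma_n^2(\lambda'_{0:n}) \right| + \left| \frac{\{y_n - \mu_n(\lambda_{0:n})\}^2}{\sigma_n^2(\lambda_{0:n})} - \frac{\{y_n - \mu_n(\lambda'_{0:n})\}^2}{\sigma_n^2(\lambda'_{0:n})} \right| \]

where the first term is such that

\[ \left| \log \sigma_n^2(\lambda_{0:n}) - \log \sigma_n^2(\lambda'_{0:n}) \right| \le \frac{1}{\underline{\sigma}^2} \left| \sigma_n^2(\lambda_{0:n}) - \sigma_n^2(\lambda'_{0:n}) \right| \]

according to Lemma 11, and the second term is such that

\begin{align*}
\left| \frac{\{y_n - \mu_n(\lambda_{0:n})\}^2}{\sigma_n^2(\lambda_{0:n})} - \frac{\{y_n - \mu_n(\lambda'_{0:n})\}^2}{\sigma_n^2(\lambda'_{0:n})} \right|
&\le \frac{1}{\sigma_n^2(\lambda_{0:n})} \left| \{y_n - \mu_n(\lambda_{0:n})\}^2 - \{y_n - \mu_n(\lambda'_{0:n})\}^2 \right| \\
&\quad + \frac{\{y_n - \mu_n(\lambda'_{0:n})\}^2}{\sigma_n^2(\lambda_{0:n})\,\sigma_n^2(\lambda'_{0:n})} \left| \sigma_n^2(\lambda_{0:n}) - \sigma_n^2(\lambda'_{0:n}) \right|
\end{align*}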



1 



ah \ 2C ' C a 
t p H —, — 



1 — ah / 

and one concludes using (8.11) and taking



1 — ah 



= (p n = exp • 



C a C^Cy 
2^ + 



1 



1 — ah 



CyC a 



all 



1 — ah 



□ 

Obviously, the boundedness condition on the realizations $y_n$ is not entirely satisfactory, as the generating process of $(Y_n)$ is such that $Y_n$ should leave any interval eventually. However, $Y_n$ is marginally a Gaussian variable with variance uniformly bounded in time (since $\bar{h} < 1$), so this remains a reasonable approximation if $C_y$ is large enough. Extending the above result to more general conditions is left for future research.
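The claim that the variance of $Y_n$ stays bounded in time can be checked on a small example. The Python sketch below is an illustration only: it uses the elementary recursion $\mathrm{Var}(X_n) = h(\lambda_n)^2\, \mathrm{Var}(X_{n-1}) + w(\lambda_n)$ conditional on the path, and the explicit bound $\bar{w}/(1 - \bar{h}^2) + \bar{v}$ is our own derivation from that recursion, not a formula stated in the text. The numerical values and the name `conditional_variances` are hypothetical.

```python
import random


def conditional_variances(hs, vs, ws):
    """Var(Y_n | path) for n = 1..len(hs), starting from Var(X_0) = 0."""
    var_x, out = 0.0, []
    for h, v, w in zip(hs, vs, ws):
        var_x = h * h * var_x + w     # variance of X_n given the path
        out.append(var_x + v)         # add the observation noise variance
    return out


# Hypothetical bounds on h, v and w (assumptions 1 and 2 above).
H_BAR, V_BAR, W_BAR = 0.9, 2.0, 1.5
BOUND = W_BAR / (1.0 - H_BAR ** 2) + V_BAR   # geometric-series bound, derived here

rng = random.Random(0)
n = 500
hs = [rng.uniform(-H_BAR, H_BAR) for _ in range(n)]
vs = [rng.uniform(0.5, V_BAR) for _ in range(n)]
ws = [rng.uniform(0.5, W_BAR) for _ in range(n)]
variances = conditional_variances(hs, vs, ws)
print(max(variances), BOUND)
```

The bound does not depend on $n$, so a large but fixed $C_y$ covers the observations with high probability at every time step, which is the sense in which the boundedness assumption is a reasonable approximation.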



8.3 Application to standard state-space models 

Consider a 'standard' state-space model, based on a linear auto-regressive state process $(X_n)$:

\[ X_n = \rho X_{n-1} + \Lambda_n, \qquad \Lambda_1, \ldots, \Lambda_n, \ldots \ \text{i.i.d.} \tag{8.20} \]

for $n > 0$, $\rho \in (-1, 1)$ and $X_0 = \Lambda_0$, and an observed process $(Y_n)$, with conditional density, with respect to an appropriate dominating measure, and conditional on $X_n = x_n$, given by the potential function $\Psi_n^X(x_n)$.

In this section, we show how to apply our stability results to such a standard state-space model, where the potential function depends only on the current state $X_n$. We rewrite the model as a state-space model with hidden Markov chain $(\Lambda_n)$, and observed process $(Y_n)$ corresponding to potential function

\[ \Psi_n(\lambda_{0:n}) = \Psi_n^X\left( \sum_{k=0}^{n} \rho^k \lambda_{n-k} \right) \]

where the argument $x_n$ in the right-hand side has been substituted with the appropriate function of $\lambda_{0:n}$, as derived from (8.20).

Clearly, the reformulated model satisfies Hypothesis 1: the $(\Lambda_n)$ are i.i.d., hence they form a Markov chain with mixing coefficient $\varepsilon_n = 1$. If we assume that the $\Psi_n(\lambda_{0:n})$ are such that Hypotheses 2 and 3 hold as well, then we can apply directly Theorem 7. However, the path-dependent formulation of this model is artificial, and, in practice, we are interested in filtering the process $X_n$, conditional on the $Y_n$'s, rather than filtering the $\Lambda_n$'s, again conditional on the $Y_n$'s. More precisely, we wish to approximate the conditional expectation of

\[ g\left( \sum_{k=0}^{p-1} \rho^k \Lambda_{n-k} + \sum_{k=p}^{n} \rho^k \Lambda_{n-k} \right) \]

for some bounded function $g$, and, provided $g$ is also Lipschitz, with constant $K$, and that the $\Lambda_n$'s lie in the interval $[-l, l]$, for some $l > 0$, one has:

\[ \left| g\left( \sum_{k=0}^{p-1} \rho^k \lambda_{n-k} + \sum_{k=p}^{n} \rho^k \lambda_{n-k} \right) - g\left( \sum_{k=0}^{p-1} \rho^k \lambda_{n-k} \right) \right| \le \frac{K l}{1 - \tau}\, \tau^p \]

where $\tau = |\rho|$. Therefore, we must consider an additional term in the particle error attached to the filtering of $(X_n)$, which stems from the difference between the filtering distribution of $X_n$ and that of $\Lambda_{n-p+1:n}$, for some integer $p$. Consider the following estimate of the particle error for functions of $X_n$:

£ n,N(Vl:n) = Sup E N (\{RunC - R V.nC, fg)\ l*0:n = VO-.n) 

g:\\g\\oa=l,9ELip(K) 

where $\mathrm{Lip}(K)$ denotes the set of Lipschitz functions with Lipschitz constant $K$, and $f_g$ is the function $E^{n+1} \to \mathbb{R}$ such that

\[ f_g(\lambda_{0:n}) = g\left( \sum_{k=0}^{n} \rho^k \lambda_{n-k} \right), \]

i.e., loosely speaking, $f_g(\lambda_{0:n}) = g(x_n)$, where $x_n$ must be substituted by its expression as a function of $\lambda_{0:n}$.

Lemma 14. For the state-space model described above, one has, for any $n \ge p$,

\[ \mathcal{E}^X_{n,N}(y_{1:n}) \le \mathcal{E}^p_{n,N}(y_{1:n}) + \frac{K l}{1 - \tau}\, \tau^p. \]
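The additional term $K l \tau^p / (1 - \tau)$ comes from dropping all but the last $p$ innovations in the expression of $x_n$. The Python sketch below illustrates this truncation under assumed values: $g = \tanh$ (bounded by 1 and 1-Lipschitz, so $K = 1$), $\rho = -0.8$ and $l = 1$ are hypothetical choices, as are the function names.

```python
import math
import random


def state_from_innovations(lams, rho):
    """x_n = sum_{k=0}^{n} rho^k lam_{n-k}, computed via recursion (8.20) with X_0 = lam_0."""
    x = lams[0]
    for lam in lams[1:]:
        x = rho * x + lam
    return x


def truncated_state(lams, rho, p):
    """Keep only the last p innovations: sum_{k=0}^{p-1} rho^k lam_{n-k}."""
    n = len(lams) - 1
    return sum(rho ** k * lams[n - k] for k in range(p))


rho, l, K = -0.8, 1.0, 1.0            # assumed values; K is the Lipschitz constant of tanh
tau = abs(rho)
rng = random.Random(1)
lams = [rng.uniform(-l, l) for _ in range(41)]   # lam_0, ..., lam_40
x_n = state_from_innovations(lams, rho)

results = {}
for p in range(1, 11):
    gap = abs(math.tanh(x_n) - math.tanh(truncated_state(lams, rho, p)))
    bound = K * l * tau ** p / (1.0 - tau)
    results[p] = (gap, bound)
    print(p, gap, bound)
```

The observed gap is typically far below the bound, since the bound assumes every discarded innovation sits at the edge of $[-l, l]$ with the least favourable sign.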



Taking into account this additional error term, we can derive time-uniform estimates of the stability of the particle algorithm. For the sake of space, we focus on the following simple example: $Y_n \in \{-1, 1\}$, $Y_n = 1$ with probability $1/(1 + e^{x_n})$, $Y_n = -1$ otherwise. The potential function (for the model in its standard formulation) equals:

\[ \Psi_n^X(x_n) = \frac{1}{1 + e^{y_n x_n}}. \]



We recall that the support of the $(\Lambda_n)$ is $[-l, l]$, and therefore $X_n \in [-l', l']$ almost surely, with $l' = l/(1 - \tau)$. Thus, Hypothesis 3 holds for $b_n = 1/(1 + e^{-l'})$, $a_n = 1 + e^{l'}$. For Hypothesis 2, standard calculations show that, for two vectors $\lambda_{0:n}$ and $\lambda'_{0:n}$ such that $\lambda_{n-p+1:n} = \lambda'_{n-p+1:n}$, one has

\[ \left| \log \Psi_n(\lambda_{0:n}) - \log \Psi_n(\lambda'_{0:n}) \right| \le \left| \sum_{k=p}^{n} \rho^k \left( \lambda_{n-k} - \lambda'_{n-k} \right) \right| \le 2 l' \tau^p \]

provided $\tau = |\rho|$. Hence, using inequality (8.11), Hypothesis 2 holds, with $\phi_n = e^{2l'} - 1$.

For this specific model, we have the following result. 

Lemma 15. For the specific model described above, and provided $c\tau^3 < 1$, where $\tau = |\rho|$, $c = e^{l'}$, one has:



£* N (y 1:n )<C{log(N)+D} 



1 



1+3 log c/ log- 



E 

N 



where $C$ and $D$ were defined in Corollary 10, $\phi = e^{2l'} - 1$, and $E = 4Kl'c/3\phi$.
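The elementary constants of this example are easy to verify numerically. The Python sketch below assumes the hypothetical values $\rho = 0.5$ and $l = 1$; it checks that the logistic potential stays between $1/a_n$ and $b_n$ on $[-l', l']$, that $a_n b_n = c = e^{l'}$, and that $\log \Psi$ is 1-Lipschitz in $x$, which is the property behind the bound $2l'\tau^p$. The names `log_potential` and `lipschitz_gap` are ours.

```python
import math


def log_potential(y, x):
    """log of 1 / (1 + exp(y * x)), for y in {-1, +1}."""
    return -math.log1p(math.exp(y * x))


rho, l = 0.5, 1.0                         # assumed values for illustration
tau = abs(rho)
l_prime = l / (1.0 - tau)                 # X_n lies in [-l', l']
a_n = 1.0 + math.exp(l_prime)             # Psi >= 1 / a_n on [-l', l']
b_n = 1.0 / (1.0 + math.exp(-l_prime))    # Psi <= b_n on [-l', l']
c = math.exp(l_prime)                     # a_n * b_n = c


def lipschitz_gap(y, x, x_alt):
    """|log Psi(x) - log Psi(x_alt)| for a fixed observation y."""
    return abs(log_potential(y, x) - log_potential(y, x_alt))


grid = [-l_prime + 2.0 * l_prime * i / 50 for i in range(51)]
lowest = min(math.exp(log_potential(y, x)) for y in (-1, 1) for x in grid)
highest = max(math.exp(log_potential(y, x)) for y in (-1, 1) for x in grid)
print(lowest, 1.0 / a_n, highest, b_n, a_n * b_n, c)
```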

The above model does not fulfil the usual conditions required in standard stability results, see e.g. Del Moral (2004, Section 7.4.3), because the Markov chain $(X_n)$ is not mixing. Thus, it is remarkable that the time-uniform stability of this model is established using a Feynman-Kac formulation with path-dependent potentials.



9 Conclusion 

To extend our results to a broader class of models, three directions may be worth investigating. First, it may be possible to bound the particle error directly, without resorting to a comparison with an artificial, truncated potential function. It seems difficult however to avoid some form of truncation, as the path process $\Lambda_{0:n}$ itself does not benefit from any sort of mixing property, while fixed segments $\Lambda_{n-p+1:n}$ do. Second, one may try to loosen Hypothesis 1 (Markov kernel is mixing) and Hypothesis 3 (potential function is bounded), using for instance the approach of Oudjane and Rubenthaler (2005). Third, it seems possible to adapt our general result on the particle error bound to several models not considered in this paper, in particular standard models with potential functions depending on the last state only, by using and extending the approach developed in the previous section.